Jozef Stefan International Postgraduate School

Authors

  • Luis Rei
  • Dunja Mladenic
Abstract

Although it is widely accepted that the success of machine learning algorithms depends on data representation, most NLP and text mining systems and techniques still treat words as indices in a vocabulary, devoid of meaning. We believe that discarding semantic and syntactic similarity between words results in the loss of explanatory factors that would contribute to the success of these systems and techniques. In text mining, the common representations also ignore word order. In this work we analyze continuous vector representations of words, which preserve semantic and syntactic regularities, as well as techniques for composing over them that are sensitive to word order. We show how these have been used to improve language models, NLP, machine translation, information retrieval and text mining systems. Finally, we present the shared tasks used to evaluate the representations and their compositionality, and discuss directions for further work.
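The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the three-word vocabulary, the hand-picked dense vectors, and the helper names `cosine`, `one_hot`, and `bag_of_words` are all assumptions made for illustration. It shows that index-style (one-hot) vectors give zero similarity between any two distinct words, that dense vectors can encode graded similarity, and that a bag-of-words representation cannot tell apart two sentences that differ only in word order.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vocabulary (assumption, for illustration only).
vocab = ["cat", "dog", "car"]


def one_hot(word):
    """Words as vocabulary indices: a 1 at the word's position, 0 elsewhere."""
    return [1.0 if w == word else 0.0 for w in vocab]


# Hypothetical hand-picked dense vectors, not learned embeddings.
dense = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

# Distinct one-hot vectors are orthogonal, so every word pair looks unrelated.
one_hot_sim = cosine(one_hot("cat"), one_hot("dog"))

# Dense vectors can place "cat" closer to "dog" than to "car".
dense_sim_cat_dog = cosine(dense["cat"], dense["dog"])
dense_sim_cat_car = cosine(dense["cat"], dense["car"])


def bag_of_words(sentence):
    """Word counts only; the order of the words is discarded."""
    return Counter(sentence.lower().split())


# The two sentences mean different things but have identical bags.
same_bag = bag_of_words("dog bites man") == bag_of_words("man bites dog")

print(one_hot_sim, dense_sim_cat_dog > dense_sim_cat_car, same_bag)
```

In practice the dense vectors would be learned from a corpus rather than written by hand, and an order-sensitive composition method would replace the bag-of-words step.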





Publication date: 2014